
Recognition of prokaryotic promoters based on a novel variable-window Z-curve method

Abstract

Transcription is the first step in gene expression, and it is the step at which most regulation of expression occurs. Although sequenced prokaryotic genomes provide a wealth of information, transcriptional regulatory networks remain poorly understood from the available genomic information, largely because accurate prediction of promoters is difficult. To improve promoter recognition performance, a novel variable-window Z-curve method is developed to extract general features of prokaryotic promoters. The extracted features are then classified using the partial least squares technique. To verify prediction performance, the proposed method is applied to promoter fragments of two representative prokaryotic model organisms, Escherichia coli and Bacillus subtilis. Owing to the feature extraction and selection power of the proposed method, the promoter prediction accuracies improve markedly over most existing approaches. For E. coli, the accuracies are 96.05% (σ70 promoters, coding negative samples), 90.44% (σ70 promoters, non-coding negative samples), 92.13% (known sigma-factor promoters, coding negative samples), and 92.50% (known sigma-factor promoters, non-coding negative samples). For B. subtilis, the accuracies are 95.83% (known sigma-factor promoters, coding negative samples) and 99.09% (known sigma-factor promoters, non-coding negative samples). Additionally, because the method is linear, it is computationally simple and runs in a matter of minutes on ordinary personal computers or even laptops. More importantly, there is no need to optimize parameters, so the method is practical for predicting promoters in other species without any prior knowledge of the statistical properties of the samples.
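The abstract does not spell out how the variable-window features are built, so the following is only a minimal sketch of the general idea, not the paper's procedure. It computes the standard Z-curve components (purine/pyrimidine, amino/keto, and weak/strong hydrogen-bond contrasts) over non-overlapping windows of several sizes. The function names `z_curve_xyz` and `windowed_features`, the window sizes, and the non-overlapping tiling are all assumptions made for illustration; the partial least squares classification step is not shown.

```python
from collections import Counter


def z_curve_xyz(fragment):
    """Return normalized Z-curve components (x, y, z) for a DNA fragment."""
    counts = Counter(fragment.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        # No recognizable bases: return a neutral feature triple.
        return (0.0, 0.0, 0.0)
    x = ((a + g) - (c + t)) / total  # purine vs. pyrimidine
    y = ((a + c) - (g + t)) / total  # amino vs. keto
    z = ((a + t) - (g + c)) / total  # weak vs. strong hydrogen bonding
    return (x, y, z)


def windowed_features(seq, window_sizes=(10, 20)):
    """Concatenate Z-curve triples over non-overlapping windows.

    The window sizes and tiling are hypothetical choices for this sketch;
    trailing bases that do not fill a whole window are ignored.
    """
    features = []
    for w in window_sizes:
        for start in range(0, len(seq) - w + 1, w):
            features.extend(z_curve_xyz(seq[start:start + w]))
    return features
```

For a 20-base fragment and the default window sizes, this yields two 10-base windows and one 20-base window, giving a feature vector of length 9 that could then be passed to a linear classifier.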

Bibliographic details

  • Author

    Song, Kai

  • Year: 2011
  • Original format: PDF
  • Language: English
